A survey on graphic processing unit computing for large-scale data mining
Abstract
General-purpose computation using Graphic Processing Units (GPUs) is a well-established research area focusing on high-performance computing solutions for massively parallelizable and time-consuming problems. Classical methodologies in machine learning and data mining cannot handle processing of massive and high-speed volumes of information in the context of the big data era. GPUs have successfully improved the scalability of data mining algorithms to address significantly larger dataset sizes in many application areas. The popularization of distributed computing frameworks for big data mining opens up new opportunities for transformative solutions combining GPUs and distributed frameworks. This survey analyzes current trends in the use of GPU computing for large-scale data mining, discusses GPU architecture advantages for handling volume and velocity of data, identifies limiting factors hampering the scalability of the problems, and discusses open issues and future directions. © 2017 Wiley Periodicals, Inc.
Similar resources
Mining Massive-Scale Spatiotemporal Trajectories in Parallel: A Survey
With the popularization of positioning devices such as GPS navigators and smart phones, large volumes of spatiotemporal trajectory data have been produced at unprecedented speed. For many trajectory mining problems, a number of computationally efficient approaches have been proposed. However, to more effectively tackle the challenge of big data, it is important to exploit various advanced paral...
Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal
Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks, because noise arises during acquisition or transmission. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents a Cellular Automata (CA) framework for noise removal of distorted image by the salt an...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application
With the rapid development of the internet, the amount of information and data being produced is massive. Clients are therefore overwhelmed by the volume of data, and it is difficult to tell which data are useful. Data mining can overcome this problem. When data mining is run on cloud computing, it reduces processing time, energy usage, and costs. As the speed of ...
Parallel Implementations of Probabilistic Latent Semantic
Probabilistic Latent Semantic Analysis (PLSA) has been successfully applied to many text mining tasks such as retrieval, clustering, summarization, etc. PLSA involves iterative computation for a large number of parameters and may take hours or even days to process a large dataset, thus speeding up PLSA is highly motivated in the domain of text mining. Recently, the general purpose graphic proce...
Journal title:
- Wiley Interdisc. Rev.: Data Mining and Knowledge Discovery
Volume 8, issue
Pages -
Publication date 2018